Ukrainian married couple aged 75 killed in Russian attack on Odesa
A Ukrainian married couple, both aged 75, were killed in a Russian attack on Odesa, Ukrainian officials said. Russia launched a series of drone attacks on and near Ukraine's southern port city. The assault destroyed residential buildings and struck a foreign merchant ship, according to Ukrainian authorities. A separate attack killed the married couple and wounded another person, Ukraine's State Emergency Service reported. Serhiy Lysak, head of the local military administration, shared images of a building engulfed in flames and another torn open along one side as emergency crews worked inside.
- Europe > Ukraine (1.00)
- Asia > Russia (1.00)
- South America (0.42)
- (9 more...)
- Transportation (1.00)
- Government > Regional Government > Europe Government > Ukraine Government (0.57)
- Government > Regional Government > Europe Government > Russia Government (0.38)
- Government > Regional Government > Asia Government > Russia Government (0.38)
From Anger to Joy: How Nationality Personas Shape Emotion Attribution in Large Language Models
Kamruzzaman, Mahammed, Monsur, Abdullah Al, Kim, Gene Louis, Chhabra, Anshuman
Emotions are a fundamental facet of human experience, varying across individuals, cultural contexts, and nationalities. Given the recent success of Large Language Models (LLMs) as role-playing agents, we examine whether LLMs exhibit emotional stereotypes when assigned nationality-specific personas. Specifically, we investigate how different countries are represented in pre-trained LLMs through emotion attributions and whether these attributions align with cultural norms. To provide a deeper interpretive lens, we incorporate four key cultural dimensions, namely Power Distance, Uncertainty Avoidance, Long-Term Orientation, and Individualism, derived from Hofstede's cross-cultural framework. Our analysis reveals significant nationality-based differences, with emotions such as shame, fear, and joy being disproportionately assigned across regions. Furthermore, we observe notable misalignment between LLM-generated and human emotional responses, particularly for negative emotions, highlighting the presence of reductive and potentially biased stereotypes in LLM outputs.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.92)
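The persona-prompting setup described in the abstract can be sketched minimally. The prompt wording, emotion list, and function names below are illustrative assumptions, not the authors' actual protocol; the model call itself is omitted, and only prompt construction and response tallying are shown.

```python
from collections import Counter

# Hypothetical closed emotion vocabulary for forced-choice attribution.
EMOTIONS = ["joy", "sadness", "anger", "fear", "shame", "surprise"]

def persona_prompt(nationality: str, situation: str) -> str:
    """Build a nationality-persona prompt asking for a single emotion label."""
    return (
        f"You are a person from {nationality}. "
        f"Situation: {situation} "
        f"Which single emotion best describes how you feel? "
        f"Choose one of: {', '.join(EMOTIONS)}."
    )

def attribution_distribution(responses: list[str]) -> dict[str, float]:
    """Normalize raw model responses into a per-emotion frequency distribution,
    discarding responses outside the emotion vocabulary."""
    counts = Counter(
        r.strip().lower() for r in responses if r.strip().lower() in EMOTIONS
    )
    total = sum(counts.values()) or 1
    return {e: counts[e] / total for e in EMOTIONS}
```

Comparing such distributions across nationality personas (and against human survey baselines) is the kind of analysis the paper reports, e.g. shame or fear being over-attributed to some regions.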
AI Diffusion in Low Resource Language Countries
Misra, Amit, Zamir, Syed Waqas, Hamidouche, Wassim, Becker-Reshef, Inbal, Ferres, Juan Lavista
Artificial intelligence (AI) is diffusing globally at unprecedented speed, but adoption remains uneven. Frontier Large Language Models (LLMs) are known to perform poorly on low-resource languages due to data scarcity. We hypothesize that this performance deficit reduces the utility of AI, thereby slowing adoption in Low-Resource Language Countries (LRLCs). To test this, we use a weighted regression model to isolate the language effect from socioeconomic and demographic factors, finding that LRLCs have a share of AI users that is approximately 20% lower relative to their baseline. These results indicate that linguistic accessibility is a significant, independent barrier to equitable AI diffusion.
- North America > The Bahamas (0.14)
- North America > United States > District of Columbia > Washington (0.05)
- South America > Venezuela (0.04)
- (186 more...)
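The weighted-regression design described above can be illustrated with a toy weighted least-squares fit. The data, weights, and column layout below are entirely hypothetical stand-ins for the paper's socioeconomic and demographic controls; only the mechanics of isolating a binary LRLC indicator are shown.

```python
import numpy as np

def weighted_ols(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Weighted least squares: solve (X'WX) beta = X'Wy."""
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Toy design matrix: column 0 = intercept, column 1 = a GDP-like control,
# column 2 = 1 if the country is an LRLC, else 0.
X = np.array([
    [1.0, 2.0, 0.0],
    [1.0, 1.5, 0.0],
    [1.0, 2.0, 1.0],
    [1.0, 1.0, 1.0],
])
y = np.array([0.50, 0.45, 0.30, 0.22])   # hypothetical AI-user shares
w = np.array([10.0, 8.0, 5.0, 4.0])      # e.g. population weights
beta = weighted_ols(X, y, w)
# beta[2] estimates the LRLC effect net of the other columns; with this toy
# data it comes out negative, mirroring the direction of the paper's finding.
```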
Evaluating Large Language Models for IUCN Red List Species Information
Large Language Models (LLMs) are rapidly being adopted in conservation to address the biodiversity crisis, yet their reliability for species evaluation is uncertain. This study systematically validates five leading models on 21,955 species across four core IUCN Red List assessment components: taxonomy, conservation status, distribution, and threats. A critical paradox was revealed: models excelled at taxonomic classification (94.9%) but consistently failed at conservation reasoning (27.2% for status assessment). This knowledge-reasoning gap, evident across all models, suggests inherent architectural constraints, not just data limitations. Furthermore, models exhibited systematic biases favoring charismatic vertebrates, potentially amplifying existing conservation inequities. These findings delineate clear boundaries for responsible LLM deployment: they are powerful tools for information retrieval but require human oversight for judgment-based decisions. A hybrid approach is recommended, where LLMs augment expert capacity while human experts retain sole authority over risk assessment and policy.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
The Judge Variable: Challenging Judge-Agnostic Legal Judgment Prediction
This study examines the role of human judges in legal decision-making by using machine learning to predict child physical custody outcomes in French appellate courts. Building on the legal realism-formalism debate, we test whether individual judges' decision-making patterns significantly influence case outcomes, challenging the assumption that judges are neutral variables who apply the law uniformly. To ensure compliance with French privacy laws, we implement a strict pseudonymization process. Our analysis uses 18,937 living-arrangement rulings extracted from 10,306 cases. We compare models trained on individual judges' past rulings (specialist models) with a judge-agnostic model trained on aggregated data (generalist model). The prediction pipeline is a hybrid approach combining large language models (LLMs) for structured feature extraction and ML models for outcome prediction (Random Forest, XGBoost, and Support Vector Classifier). Our results show that specialist models consistently achieve higher predictive accuracy than the generalist model, with top-performing models reaching F1 scores as high as 92.85%, compared to the generalist model's 82.63%, despite the latter being trained on 20x to 100x more samples. Specialist models capture stable individual patterns that are not transferable to other judges. In-domain and cross-domain validity tests provide empirical support for legal realism, demonstrating that judicial identity plays a measurable role in legal outcomes. All data and code used will be made available.
- South America > French Guiana > Guyane > Cayenne (0.04)
- Oceania > New Caledonia > South Province > Noumea (0.04)
- Oceania > French Polynesia > Windward Islands > Papeete (0.04)
- (15 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Law > Litigation (1.00)
- Law > Government & the Courts (0.94)
- Law > Criminal Law (0.93)
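The specialist-versus-generalist split can be sketched in radically simplified form. The majority-label "model" below is an illustrative stand-in for the paper's trained classifiers, and the judge identifiers and outcome labels are hypothetical; the LLM feature-extraction stage is omitted entirely.

```python
from collections import Counter, defaultdict

def majority_label(labels: list[str]) -> str:
    """Most frequent outcome label -- a toy stand-in for a trained classifier."""
    return Counter(labels).most_common(1)[0][0]

def fit_specialist_and_generalist(rulings: list[tuple[str, str]]):
    """rulings: (judge_id, outcome) pairs.
    Returns one model per judge plus a single pooled model."""
    by_judge = defaultdict(list)
    for judge, outcome in rulings:
        by_judge[judge].append(outcome)
    specialists = {j: majority_label(v) for j, v in by_judge.items()}
    generalist = majority_label([o for _, o in rulings])
    return specialists, generalist

# Hypothetical rulings: two judges with opposite tendencies.
rulings = [
    ("judge_A", "mother"), ("judge_A", "mother"), ("judge_A", "shared"),
    ("judge_B", "shared"), ("judge_B", "shared"), ("judge_B", "mother"),
]
specialists, generalist = fit_specialist_and_generalist(rulings)
# The per-judge models capture individual tendencies that the pooled
# model averages away -- the intuition behind the specialist advantage.
```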
An Explainable and Interpretable Composite Indicator Based on Decision Rules
Corrente, Salvatore, Greco, Salvatore, Słowiński, Roman, Zappalà, Silvano
Composite indicators are widely used to score or classify units evaluated on multiple criteria. Their construction involves aggregating criteria evaluations, a common practice in Multiple Criteria Decision Aiding (MCDA). In MCDA, various methods have been proposed to address key aspects of multiple criteria evaluations, such as the measurement scales of the criteria, the degree of acceptable compensation between them, and their potential interactions. However, beyond producing a final score or classification, it is essential to ensure the explainability and interpretability of results as well as the procedure's transparency. This paper proposes a method for constructing explainable and interpretable composite indicators using "if..., then..." decision rules. We consider the explainability and interpretability of composite indicators in four scenarios: (i) decision rules explain numerical scores obtained from an aggregation of numerical codes corresponding to ordinal qualifiers; (ii) an obscure numerical composite indicator classifies units into quantiles; (iii) given preference information provided by a Decision Maker in the form of classifications of some reference units, a composite indicator is constructed using decision rules; (iv) the classification of a set of units results from the application of an MCDA method and is explained by decision rules. To induce the rules from scored or classified units, we apply the Dominance-based Rough Set Approach. The resulting decision rules relate the class assignment or unit's score to threshold conditions on values of selected indicators in an intelligible way, clarifying the underlying rationale. Moreover, they serve to recommend composite indicator assessment for new units of interest.
- Asia > Laos (0.14)
- South America > Brazil (0.04)
- Africa > Senegal (0.04)
- (182 more...)
- Health & Medicine (1.00)
- Materials > Metals & Mining (0.93)
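The "if..., then..." decision rules described above can be sketched as threshold conditions on indicator values. The rules, indicator names, and class order below are hypothetical, and the sketch shows only rule application; the rule-induction step (the Dominance-based Rough Set Approach) is omitted.

```python
# Each rule reads: if every listed indicator meets its threshold,
# assign at least class `cls` ("at least" rules over an ordered class set).
RULES = [
    ({"education": 0.7, "health": 0.6}, "high"),
    ({"education": 0.4}, "medium"),
]
ORDER = {"low": 0, "medium": 1, "high": 2}

def classify(unit: dict[str, float]) -> str:
    """Assign the best class among all rules whose conditions the unit satisfies."""
    best = "low"
    for conditions, cls in RULES:
        if all(unit.get(ind, 0.0) >= thr for ind, thr in conditions.items()):
            if ORDER[cls] > ORDER[best]:
                best = cls
    return best
```

Because each assignment traces back to explicit threshold conditions, the rationale behind a unit's class is directly readable, which is the explainability property the paper targets.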
GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking
Schneider, Florian, Holtermann, Carolin, Biemann, Chris, Lauscher, Anne
Large Vision-Language Models (LVLMs) have recently gained attention due to their distinctive performance and broad applicability. While it has been previously shown that their efficacy in usage scenarios involving non-Western contexts falls short, existing studies are limited in scope, covering just a narrow range of cultures, focusing exclusively on a small number of cultural aspects, or evaluating a limited selection of models on a single task only. Towards globally inclusive LVLM research, we introduce GIMMICK, an extensive multimodal benchmark designed to assess a broad spectrum of cultural knowledge across 144 countries representing six global macro-regions. GIMMICK comprises six tasks built upon three new datasets that span 728 unique cultural events or facets on which we evaluated 20 LVLMs and 11 LLMs, including five proprietary and 26 open-weight models of all sizes. We systematically examine (1) regional cultural biases, (2) the influence of model size, (3) input modalities, and (4) external cues. Our analyses reveal strong biases toward Western cultures across models and tasks and highlight strong correlations between model size and performance, as well as the effectiveness of multimodal input and external geographic cues. We further find that models have more knowledge of tangible than intangible aspects (e.g., food vs. rituals) and that they excel in recognizing broad cultural origins but struggle with a more nuanced understanding.
- South America > Colombia (0.28)
- Africa > Republic of the Congo (0.28)
- Europe > Germany (0.14)
- (169 more...)
- Law (1.00)
- Health & Medicine (1.00)
- Media (0.92)
- (3 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
What is in a name? Mitigating Name Bias in Text Embeddings via Anonymization
Manchanda, Sahil, Shivaswamy, Pannaga
Text-embedding models often exhibit biases arising from the data on which they are trained. In this paper, we examine a hitherto unexplored bias in text embeddings: bias arising from the presence of names, such as persons, locations, and organizations, in the text. Our study shows how the presence of name bias in text-embedding models can potentially lead to erroneous conclusions in assessments of thematic similarity. Text embeddings can mistakenly indicate similarity between texts based on the names they contain, even when their actual semantic content is unrelated, or indicate dissimilarity simply because of differing names, even when the texts match semantically. We first demonstrate the presence of name bias in different text-embedding models and then propose text anonymization during inference, which involves removing references to names while preserving the core theme of the text. The efficacy of the anonymization approach is demonstrated on two downstream NLP tasks, achieving significant performance gains. Our simple and training- and optimization-free approach offers a practical and easily implementable solution to mitigate name bias.
- Europe > France (0.04)
- Asia > India (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (145 more...)
- Health & Medicine (1.00)
- Law (0.67)
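The proposed inference-time anonymization can be sketched simply. In practice the names would presumably be detected automatically (e.g., by an NER model); this sketch assumes the name list is already given, and the placeholder format is an illustrative choice, not the paper's.

```python
import re

def anonymize(text: str, names: list[str]) -> str:
    """Replace each known name with a type-neutral placeholder so that
    embeddings reflect the text's theme rather than its named entities."""
    for i, name in enumerate(names):
        text = re.sub(rf"\b{re.escape(name)}\b", f"[ENTITY_{i}]", text)
    return text

# anonymize("Alice met Bob in Paris.", ["Alice", "Bob", "Paris"])
# -> "[ENTITY_0] met [ENTITY_1] in [ENTITY_2]."
```

Embedding the anonymized string instead of the original removes the name signal while keeping the sentence structure, which is the core of the mitigation the paper evaluates.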
CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries
Liu, Shudong, Jin, Yiqiao, Li, Cheng, Wong, Derek F., Wen, Qingsong, Sun, Lichao, Chen, Haipeng, Xie, Xing, Wang, Jindong
Vision-language models (VLMs) have advanced human-AI interaction but struggle with cultural understanding, often misinterpreting symbols, gestures, and artifacts due to biases in predominantly Western-centric training data. In this paper, we construct CultureVerse, a large-scale multimodal benchmark covering 19,682 cultural concepts, 188 countries/regions, 15 cultural topics, and 3 question types, with the aim of characterizing and improving VLMs' multicultural understanding capabilities. Then, we propose CultureVLM, a series of VLMs fine-tuned on our dataset to achieve significant performance improvements in cultural understanding. Our evaluation of 16 models reveals significant disparities, with stronger performance on Western concepts and weaker results in African and Asian contexts. Fine-tuning on CultureVerse enhances cultural perception, demonstrating cross-cultural, cross-continent, and cross-dataset generalization without sacrificing performance on general VLM benchmarks. We further present insights on cultural generalization and forgetting. We hope that this work can lay the foundation for more equitable and culturally aware multimodal AI systems.
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines
Winata, Genta Indra, Hudi, Frederikus, Irawan, Patrick Amadeus, Anugraha, David, Putri, Rifki Afina, Wang, Yutong, Nohejl, Adam, Prathama, Ubaidillah Ariq, Ousidhoum, Nedjma, Amriani, Afifa, Rzayev, Anar, Das, Anirban, Pramodya, Ashmari, Adila, Aulia, Wilie, Bryan, Mawalim, Candy Olivia, Cheng, Ching Lam, Abolade, Daud, Chersoni, Emmanuele, Santus, Enrico, Ikhwantri, Fariz, Kuwanto, Garry, Zhao, Hanyang, Wibowo, Haryo Akbarianto, Lovenia, Holy, Cruz, Jan Christian Blaise, Putra, Jan Wira Gotama, Myung, Junho, Susanto, Lucky, Machin, Maria Angelica Riera, Zhukova, Marina, Anugraha, Michael, Adilazuarda, Muhammad Farid, Santosa, Natasha, Limkonchotiwat, Peerat, Dabre, Raj, Audino, Rio Alexander, Cahyawijaya, Samuel, Zhang, Shi-Xiong, Salim, Stephanie Yulia, Zhou, Yi, Gui, Yinxuan, Adelani, David Ifeoluwa, Lee, En-Shiun Annie, Okada, Shogo, Purwarianti, Ayu, Aji, Alham Fikri, Watanabe, Taro, Wijaya, Derry Tanti, Oh, Alice, Ngo, Chong-Wah
Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly in languages other than English and in underrepresented cultural contexts. To evaluate their understanding of such knowledge, we introduce WorldCuisines, a massive-scale benchmark for multilingual and multicultural, visually grounded language understanding. This benchmark includes a visual question answering (VQA) dataset with text-image pairs across 30 languages and dialects, spanning 9 language families and featuring over 1 million data points, making it the largest multicultural VQA benchmark to date. It includes tasks for identifying dish names and their origins. We provide evaluation datasets in two sizes (12k and 60k instances) alongside a training dataset (1 million instances). Our findings show that while VLMs perform better with correct location context, they struggle with adversarial contexts and predicting specific regional cuisines and languages. To support future research, we release a knowledge base with annotated food entries and images along with the VQA data.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States (0.14)
- Asia > Brunei (0.14)
- (170 more...)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)